Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization
Abstract
SUMMARY: We present a tool for control-free copy number alteration (CNA) detection using deep-sequencing data, particularly useful for cancer studies. The tool deals with two frequent problems in the analysis of cancer deep-sequencing data: the absence of a control sample and possible polyploidy of cancer cells. FREEC (control-FREE Copy number caller) automatically normalizes and segments copy number profiles (CNPs) and calls CNAs. If ploidy is known, FREEC assigns an absolute copy number to each predicted CNA. To normalize raw CNPs, the user can provide a control dataset if available; otherwise GC content is used. We demonstrate that for Illumina single-end, mate-pair or paired-end sequencing, GC-content normalization provides smooth profiles that can be further segmented and analyzed in order to predict CNAs.
AVAILABILITY: Source code and sample data are available at http://bioinfo-out.curie.fr/projects/freec/.
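To make the idea of control-free normalization concrete, the following is a minimal sketch in Python and not FREEC's actual implementation. It assumes read counts have already been tallied in fixed-size genomic windows, and it uses a simple median-per-GC-bin scaling chosen here for illustration; the abstract does not specify how FREEC models the dependence on GC content. The function names, the `bin_width` parameter, and the rounding rule for absolute copy number are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases; None if the window has none."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None
    return (seq.count("G") + seq.count("C")) / acgt


def gc_normalize(read_counts, gc_values, bin_width=0.01):
    """Divide each window's read count by the median count of windows
    that fall in the same GC bin (illustrative scheme, an assumption)."""
    bins = defaultdict(list)
    for count, gc in zip(read_counts, gc_values):
        if gc is not None:
            bins[round(gc / bin_width)].append(count)
    medians = {key: median(values) for key, values in bins.items()}

    ratios = []
    for count, gc in zip(read_counts, gc_values):
        m = medians.get(round(gc / bin_width)) if gc is not None else None
        # Windows with unknown GC or a zero median are left uncalled.
        ratios.append(count / m if m else None)
    return ratios


def absolute_copy_number(ratio, ploidy):
    """With known ploidy, a ratio of 1.0 corresponds to `ploidy` copies
    (simple rounding, assumed here for illustration)."""
    return int(round(ratio * ploidy))


if __name__ == "__main__":
    # Toy data: four windows at 40% GC and four at 60% GC.
    counts = [100, 100, 100, 150, 50, 50, 50, 25]
    gcs = [0.40, 0.40, 0.40, 0.40, 0.60, 0.60, 0.60, 0.60]
    ratios = gc_normalize(counts, gcs)
    print(ratios)
    print([absolute_copy_number(r, ploidy=2) for r in ratios])
```

In this toy input the raw counts differ twofold between the two GC classes, yet after scaling most windows sit at a ratio of 1.0, and only the windows that deviate within their own GC class stand out. A real pipeline would still need to segment the normalized profile before calling CNAs, as the abstract describes.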
Similar resources
CODEX: a normalization and copy number variation detection method for whole exome sequencing
High-throughput sequencing of DNA coding regions has become a common way of assaying genomic variation in the study of human diseases. Copy number variation (CNV) is an important type of genomic variation, but detecting and characterizing CNV from exome sequencing is challenging due to the high level of biases and artifacts. We propose CODEX, a normalization and CNV calling procedure for whole ...
PaSD-qc: quality control for single cell whole-genome sequencing data using power spectral density estimation
Single cell whole-genome sequencing (scWGS) is providing novel insights into the nature of genetic heterogeneity in normal and diseased cells. However, the whole-genome amplification process required for scWGS introduces biases into the resulting sequencing that can confound downstream analysis. Here, we present a statistical method, with an accompanying package PaSD-qc (Power Spectral Density-...
QuicK-mer: A rapid paralog sensitive CNV detection pipeline
QuicK-mer is a unified pipeline for estimating genome copy-number from high-throughput Illumina sequencing data. QuicK-mer utilizes the Jellyfish application to efficiently tabulate counts of predefined sets of k-mers. The program performs GC-normalization using defined control regions and reports paralog-specific estimates of copy-number suitable for downstream analysis. The package is freely ...
Modeling read counts for CNV detection in exome sequencing data.
Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov mode...
Correlation of HER2, MDM2, c-MYC, c-MET, and TP53 Copy Number Alterations in Circulating Tumor Cells with Tissue in Gastric Cancer Patients: A Pilot Study
Background: Analysis of gene copy number alterations in tumor samples is increasingly used for diagnostic and prognostic purposes in patients with gastric cancer (GC). However, these procedures are not always applicable due to their invasive nature. In this study, we have analyzed the copy number alterations of five genes (HER2, MDM2, c-MYC, c-MET, and TP53) with a fixed relevance for ...
Volume: 27
Publication date: 2011